Abstract

Answering a question of Haugland, we show that the pooling problem with one pool and a bounded
number of inputs can be solved in polynomial time by solving a polynomial number of linear programs 
of polynomial size. We also give an overview of known complexity results and remaining open 
problems to further characterize the border between (strongly) NP-hard and polynomially solvable 
cases of the pooling problem. 
1 Introduction, motivation and problem definition

The pooling problem is a nonconvex nonlinear programming problem with applications in the refining
and petrochemical industries [3, 16], mining, agriculture, food manufacturing, and pulp and paper
production. Informally, the problem can be stated as follows: given a set of raw material suppliers
(inputs) and qualities of the material, find a cost-minimizing way of blending these raw materials in 
intermediate pools and outputs so as to satisfy requirements on the final output qualities. The blending 
in pools and outputs introduces bilinear constraints and makes the problem hard. 

While the pooling problem has been known to be hard in practice ever since its proposal by Haverly
in 1978 [15], it was only formally proven to be strongly NP-hard by Alfaki and Haugland in 2013 [1].
Their proof of strong NP-hardness, however, considered a very general case of the problem, with arbitrary 
parameters and an arbitrary network structure. Once the parameters and the network structure are more 
specific, e.g., by bounding the number of vertices, their in- and out-degrees, or the number of qualities, 
the complexity of the problem needs to be re-examined. This way, several polynomially solvable cases
of the pooling problem were identified [2, 12, 13]. However, the border between (strongly) NP-hard and
polynomially solvable cases of the pooling problem is still only partially characterized. This is mainly
due to the combinatorial explosion of parameter choices for the problem. In this paper, we solve an open
problem that has been pointed out in [12, 13]: the pooling problem with one pool and a bounded number
of inputs is in fact polynomially solvable.

We consider a directed graph $G = (V, A)$ where $V$ is the set of vertices and $A$ is the set of arcs. $V$ is
partitioned into three subsets $I, L, J \subseteq V$: $I$ is the set of inputs, $L$ is the set of pools and $J$ is the set
of outputs. Flows are blended in pools and outputs. The pooling problem literature addresses a variety
of problem instances with $A \subseteq (I \times L) \cup (L \times L) \cup (L \times J) \cup (I \times J)$. Instances with
$A \cap (L \times L) = \emptyset$ are referred to as standard pooling problems (SPPs), and instances with
$A \cap (L \times L) \neq \emptyset$ are referred to as generalized pooling problems (GPPs). Both SPPs and GPPs can be
modelled as bilinear programs, which are special cases of nonlinear programs. Instances with $L = \emptyset$ are
referred to as blending problems and can be modelled as linear programs.

Table 1: Notation for the pooling problem

| Type | Symbol | Description |
|---|---|---|
| Sets | $V$ | Set of vertices |
| | $I$ | Set of inputs |
| | $L$ | Set of pools |
| | $J$ | Set of outputs |
| | $K$ | Set of qualities |
| | $A$ | Set of arcs |
| | $A_I$ | Set of input-to-pool arcs: $A_I := A \cap (I \times L)$ |
| | $A_J$ | Set of pool-to-output arcs: $A_J := A \cap (L \times J)$ |
| | $A_v^{\mathrm{out}}$ | Set of outgoing arcs of $v \in I \cup L$ |
| | $A_v^{\mathrm{in}}$ | Set of incoming arcs of $v \in L \cup J$ |
| Parameters | $c_a$ | Cost of flow on arc $a \in A$ |
| | $\lambda_{ik}$ | Quality value of input $i \in I$ for quality $k \in K$ |
| | $\lambda_{ak}$ | $\lambda_{ak} = \lambda_{ik}$, $a \in A_i^{\mathrm{out}}$, $i \in I$, $k \in K$ |
| | $\mu_{jk}$ | Upper bound on quality value of output $j \in J$ for quality $k \in K$ |
| | $C_v$ | Upper bound on total flow through vertex $v \in V$ |
| | $u_a$ | Upper bound on flow on arc $a \in A$ |
| Variables | $x_a$ | Flow on arc $a \in A_I$ |
| | $y_a$ | Flow on arc $a \in A_J$ |
| | $p_{\ell k}$ | Quality value of pool $\ell \in L$ for quality $k \in K$ |
| | $p_{ak}$ | $p_{ak} = p_{\ell k}$, $a \in A_\ell^{\mathrm{out}}$, $\ell \in L$, $k \in K$ |

In this paper (as in [2, 12, 13]), we study the complexity of SPPs where $A \subseteq (I \times L) \cup (L \times J)$, i.e., all arcs
are either input-to-pool or pool-to-output arcs. For notational simplicity, we denote the set of the former
by $A_I := A \cap (I \times L)$ and the set of the latter by $A_J := A \cap (L \times J)$. We do not consider input-to-output
arcs since for every such arc $(i,j)$, we can add an auxiliary pool $\ell$ and replace $(i,j)$ by an input-to-pool
arc $(i,\ell)$ and a pool-to-output arc $(\ell,j)$. Throughout this paper, we use the term pooling problem to refer
to an SPP without input-to-output arcs. We consider a set of qualities $K$ whose quality values are tracked
across the network. We assume linear blending, i.e., the quality value of a pool or output for a quality
is the convex combination of the incoming quality values weighted by the incoming flows as a fraction of
the total incoming flow.
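The linear blending rule can be illustrated with a minimal sketch. The function name `blend` and the numerical data are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
def blend(flows, qualities):
    """Quality of a blend under linear blending: the flow-weighted
    average of the incoming quality values."""
    total = sum(flows)
    if total == 0:
        raise ValueError("blend quality is undefined for zero total flow")
    return sum(f * q for f, q in zip(flows, qualities)) / total


# Hypothetical data: two incoming flows of 10 and 30 units
# with quality values 3.0 and 1.0.
pool_quality = blend([10.0, 30.0], [3.0, 1.0])
```

Because the blended quality is a ratio of two flow-dependent sums, multiplying it by an outgoing flow produces the bilinear terms that make the problem nonconvex.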

For inputs and pools $v \in I \cup L$, we denote the set of outgoing arcs of $v$ by $A_v^{\mathrm{out}}$, and for pools and
outputs $v \in L \cup J$, we denote the set of incoming arcs of $v$ by $A_v^{\mathrm{in}}$. Let $x_a$ be the flow on input-to-pool
arc $a \in A_I$, and let $y_a$ be the flow on pool-to-output arc $a \in A_J$. The cost of flow on arc $a \in A$ (which
may be negative) is given by $c_a$. The total flow through vertex $v \in V$ (resp. the flow on arc $a \in A$)
is bounded above by $C_v$ (resp. $u_a$). For every input $i \in I$ and quality $k \in K$, the quality value of the
incoming raw material is given by $\lambda_{ik}$. Let $p_{\ell k}$ denote the quality value of the blended raw materials in
pool $\ell \in L$ for quality $k \in K$. For every output $j \in J$ and quality $k \in K$, the upper bound on the quality
value of the outgoing blend is given by $\mu_{jk}$. In addition to $\lambda_{ik}$ and $p_{\ell k}$, it is sometimes more convenient
to have arc-based rather than node-based quality parameters and variables. Since the quality of flow on
arc $(v, w)$ is equal to the blended quality of the total flow through vertex $v$, we have $\lambda_{ik} = \lambda_{ak}$ for all
inputs $i \in I$, their outgoing arcs $a \in A_i^{\mathrm{out}}$ and qualities $k \in K$. Analogously, we have $p_{\ell k} = p_{ak}$ for all
pools $\ell \in L$, their outgoing arcs $a \in A_\ell^{\mathrm{out}}$ and qualities $k \in K$. Table 1 summarises the notation for the
pooling problem.

We now present the classical formulation of the pooling problem, commonly referred to as the
P-formulation [15]. There are numerous alternative formulations of the pooling problem, including the
Q-, pQ- and HYB-formulations [3], and most recently multi-commodity flow formulations.
All formulations are equivalent in the sense that there is a one-to-one correspondence between a feasible
solution of one formulation and another, and they all have the same optimal objective value. However,
the alternative formulations often show a better computational performance than the P-formulation, as
studied e.g. in [5]. A recent paper by Gupte et al. gives an excellent overview of topics that have
been studied in the context of the pooling problem. Within the scope of this paper, however, we chose
to prove complexity results using the classical P-formulation.

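In the P-formulation, a flow $(x,y)$ satisfies the following constraints:

\begin{align}
\sum_{a \in A_\ell^{\mathrm{in}}} x_a &= \sum_{a \in A_\ell^{\mathrm{out}}} y_a, & \ell &\in L, \tag{1}\\
\sum_{a \in A_i^{\mathrm{out}}} x_a &\le C_i, & i &\in I, \tag{2}\\
\sum_{a \in A_\ell^{\mathrm{out}}} y_a &\le C_\ell, & \ell &\in L, \tag{3}\\
\sum_{a \in A_j^{\mathrm{in}}} y_a &\le C_j, & j &\in J, \tag{4}\\
x_a,\ y_a &\le u_a, & a &\in A_I,\ A_J, \text{ resp.} \tag{5}
\end{align}

Constraint (1) is flow conservation which ensures that at every pool, the total incoming flow equals the
total outgoing flow. Constraints (2)-(4) are vertex capacity constraints and (5) is an arc capacity constraint. For
notational simplicity, we denote the set of flows by
$\mathcal{F} := \{(x,y) \in \mathbb{R}_{\ge 0}^{A_I} \times \mathbb{R}_{\ge 0}^{A_J} : \text{(1)-(5) are satisfied}\}$.
The P-formulation can now be stated as follows: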


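\begin{align}
\min_{x,y,p} \quad & \sum_{a \in A_I} c_a x_a + \sum_{a \in A_J} c_a y_a \notag\\
\text{s.t.} \quad & (x,y) \in \mathcal{F}, \notag\\
& \sum_{a \in A_\ell^{\mathrm{in}}} \lambda_{ak} x_a = p_{\ell k} \sum_{a \in A_\ell^{\mathrm{out}}} y_a, & \ell \in L,\ k \in K, \tag{6}\\
& \sum_{a \in A_j^{\mathrm{in}}} p_{ak} y_a \le \mu_{jk} \sum_{a \in A_j^{\mathrm{in}}} y_a, & j \in J,\ k \in K. \tag{7}
\end{align}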

Equality (6) is the pool blending constraint which ensures that the $p$ variables track the quality values
across the network. Inequality (7) is the output blending constraint. We take the requirements that
$\lambda_{ak} = \lambda_{ik}$ for all $a \in A_i^{\mathrm{out}}$, $i \in I$ and $k \in K$, and that $p_{ak} = p_{\ell k}$ for all $a \in A_\ell^{\mathrm{out}}$, $\ell \in L$ and $k \in K$, to be
implicit in the model.
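To make the roles of the blending constraints concrete, the following sketch checks flow conservation, the pool blending equality and the output blending inequality for an instance with a single pool. It is only an illustration under simplifying assumptions: capacity constraints are ignored, and the function name `check_one_pool`, the 0-based indexing and the data are hypothetical.

```python
def check_one_pool(x, y, lam, mu, tol=1e-9):
    """Check a flow (x, y) through a single pool.

    x[i]      flow from input i into the pool
    y[j]      flow from the pool to output j
    lam[i][k] quality value of input i for quality k
    mu[j][k]  upper bound on quality k at output j
    Capacity constraints are deliberately left out of this sketch.
    """
    total_in, total_out = sum(x), sum(y)
    # Flow conservation at the pool.
    if abs(total_in - total_out) > tol:
        return False
    if total_out <= tol:
        return True  # no flow, so nothing is blended
    for k in range(len(lam[0])):
        # Pool blending equality: p_k is determined by the inflows.
        p_k = sum(lam[i][k] * x[i] for i in range(len(x))) / total_out
        # Output blending inequality for every output that receives flow.
        for j in range(len(y)):
            if p_k * y[j] > mu[j][k] * y[j] + tol:
                return False
    return True


# Hypothetical instance: two inputs, two outputs, one quality.
lam = [[3.0], [1.0]]
mu = [[2.5], [1.5]]
ok_low = check_one_pool([10.0, 30.0], [20.0, 20.0], lam, mu)   # pool quality 1.5
bad_high = check_one_pool([30.0, 10.0], [20.0, 20.0], lam, mu)  # pool quality 2.5
ok_high = check_one_pool([30.0, 10.0], [40.0, 0.0], lam, mu)    # only output 0 used
```

With a single pool, every output receives the same blend, so whether an output may carry flow depends only on the proportions of the inputs.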


2 Known complexity results 

Table 2 provides an overview of known complexity results, and Figure 1 shows most of these complexity
results in a tree structure. All of these results were formally proven in [2, 10, 12, 13]. When bounding the
number of vertices, the cases of one input or output are polynomially solvable. Furthermore, the cases
of one pool and a bounded number of outputs or qualities are polynomially solvable. If we only have one
pool (and no other restrictions), then the problem remains strongly NP-hard. The same holds if we have
only one quality. The problem remains strongly NP-hard if we have one quality and two inputs or two
outputs. Only if we have one quality, two inputs and two outputs does the problem become NP-hard.
The problem also remains strongly NP-hard if the out-degrees of inputs and pools are bounded above
by two, or if the in-degrees of pools and outputs are bounded above by two. Finally, it was shown in
[10] that there exists a polynomial time algorithm which guarantees an $n$-approximation (where $n$ is the
number of output nodes). The authors of [10] also showed that if there exists a polynomial time
approximation algorithm with guarantee better than $n^{1-\varepsilon}$ for any $\varepsilon > 0$, then NP-complete problems
have randomized polynomial time algorithms.


Figure 1: Overview of known complexity results in a tree structure. For simplicity, we omit #11 and
#14 from Table 2.
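Table 2: Overview of known complexity results

| # | $\lvert I \rvert$ | $\lvert L \rvert$ | $\lvert J \rvert$ | $\lvert K \rvert$ | Bounded in-/out-degrees | Complexity | Reduction | Reference(s) |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | | | | | P | | trivial |
| 2 | | 1 | | | | sNP | MIS | [2], Corollary 1; [12], Proposition 1; [13], Theorems 1-2 |
| 3 | | | 1 | | | P | | trivial |
| 4 | | | | 1 | | sNP | X3C | see #8, #9 and #11 |
| 5 | $\le i_{\max}$ | 1 | | | | P | | this paper |
| 6 | | 1 | $\le j_{\max}$ | | | P | | [12], Proposition 2 |
| 7 | | 1 | | $\le k_{\max}$ | | P | | [2], Proposition 2; [12], Proposition 3 |
| 8 | 2 | | | 1 | | sNP | X3C | [13], Theorem 4 |
| 9 | | | 2 | 1 | | sNP | X3C | [13], Theorem 5 |
| 10 | 2 | | 2 | 1 | | NP | BP2 | [12], Proposition 5; [13], Theorem 3 |
| 11 | $\min\{\lvert I \rvert, \lvert J \rvert\} = 2$ | | | 1 | $\max\{\lvert A_i^{\mathrm{out}} \rvert, \lvert A_j^{\mathrm{in}} \rvert\} \le 6$ for all $i \in I$, $j \in J$ | sNP | X3C | [13], Corollary 1 |
| 12 | | | | | $\lvert A_i^{\mathrm{out}} \rvert \le 2$ for all $i \in I$, $\lvert A_\ell^{\mathrm{out}} \rvert \le 2$ for all $\ell \in L$ | sNP | MAX 2-SAT | [12], Proposition 7; [13], Theorem 6 |
| 13 | | | | | $\lvert A_\ell^{\mathrm{in}} \rvert \le 2$ for all $\ell \in L$, $\lvert A_j^{\mathrm{in}} \rvert \le 2$ for all $j \in J$ | sNP | MIN 2-SAT | [12], Proposition 6; [13], Theorem 7 |
| 14 | | | | | $\min\{\lvert A_\ell^{\mathrm{in}} \rvert, \lvert A_\ell^{\mathrm{out}} \rvert\} = 1$ for all $\ell \in L$ | P | | [10], Corollary 1; [12], Proposition 4; [13], Proposition 3 |

P = polynomial, NP = NP-hard, sNP = strongly NP-hard,
BP2 = bin packing with 2 bins, MAX 2-SAT = maximum 2-satisfiability, MIN 2-SAT = minimum 2-satisfiability,
MIS = maximal independent set, X3C = exact cover by 3-sets
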

3 The pooling problem with one pool and a bounded number of inputs


In this section, we consider the pooling problem with

• $m$ inputs (let $I = \{v_1, \dots, v_m\}$),

• one pool (let $L = \{\ell\}$),

• $n$ outputs (let $J = \{w_1, \dots, w_n\}$),

• $q$ qualities (let $K = \{1, \dots, q\}$),

• the set of input-to-pool arcs $A_I = \{a_1, \dots, a_m\} = \{(v_1, \ell), \dots, (v_m, \ell)\}$, and

• the set of pool-to-output arcs $A_J = \{a_{m+1}, \dots, a_{m+n}\} = \{(\ell, w_1), \dots, (\ell, w_n)\}$.

We write

• $x_i$ for the flow on input-to-pool arc $a_i$ ($i = 1, \dots, m$),

• $y_j$ for the flow on pool-to-output arc $a_{m+j}$ ($j = 1, \dots, n$),

• $c_i$ for the cost of flow on arc $a_i$ ($i = 1, \dots, m + n$),

• $\lambda_{ik}$ for the $k$-th quality value at the tail node of input-to-pool arc $a_i$ ($i = 1, \dots, m$), and

• $\mu_{jk}$ for the bound on the $k$-th quality value at the head node of arc $a_{m+j}$ ($j = 1, \dots, n$).

For a positive integer $N$, we use $[N]$ to denote the set $\{1, 2, \dots, N\}$. If for some $j \in [n]$, there exists a
$k \in [q]$ such that $\min\{\lambda_{ik} : i \in [m]\} > \mu_{jk}$, then $y_j = 0$ in every feasible solution. Hence, without loss of
generality, we assume

$$\forall j \in [n]\ \forall k \in [q] \qquad \min\{\lambda_{ik} : i \in [m]\} \le \mu_{jk}. \tag{8}$$

Note that $y_j > 0$ implies

$$\forall k \in [q] \qquad \sum_{i=1}^{m} \lambda_{ik} x_i \le \mu_{jk} \sum_{i=1}^{m} x_i. \tag{9}$$

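It has been observed in the literature that for a fixed set $J' \subseteq [n]$ of outputs, an optimal solution
that satisfies the quality constraints for all $j \in J'$ and has $y_j = 0$ for all $j \in [n] \setminus J'$, can be found by
solving the following linear program which we denote by LP($J'$):

$$
\begin{aligned}
\min_{x,y} \quad & \sum_{i=1}^{m} c_i x_i + \sum_{j \in J'} c_{m+j} y_j \\
\text{s.t.} \quad & (x,y) \in \mathcal{F}, \\
& \sum_{i=1}^{m} x_i = \sum_{j \in J'} y_j, \\
& \sum_{i=1}^{m-1} (\lambda_{ik} - \lambda_{mk}) x_i \le (\mu_{jk} - \lambda_{mk}) \sum_{i=1}^{m} x_i, \qquad j \in J',\ k \in [q].
\end{aligned}
$$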


Let val($J'$) denote the optimal value of problem LP($J'$). An optimal solution for the pooling problem
can be obtained by solving LP($J'$) for every $J' \subseteq [n]$, and choosing one with minimum val($J'$). Below
we argue that if the number $m$ of inputs is fixed, then it is sufficient to consider a polynomial number of
subsets $J'$, where the polynomial is of degree $m - 1$ in both $n$ and $q$.
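As an illustration of how small each of these linear programs is, the following sketch assembles the data of LP($J'$) as plain Python lists. Several simplifications are assumptions of this sketch and not part of the paper: indices are 0-based, the vertex and arc capacities are merged into per-variable upper bounds `ub_x`, `ub_y` and a pool capacity `cap_pool`, and the quality constraints are written in the equivalent form $\sum_i (\lambda_{ik} - \mu_{jk}) x_i \le 0$ obtained from (9). The function name `build_lp` and the data are hypothetical.

```python
def build_lp(J_sel, c, lam, mu, ub_x, ub_y, cap_pool):
    """Assemble LP(J') for one pool; variables are (x_0..x_{m-1}, y_0..y_{n-1}).

    Returns (cost, A_ub, b_ub, A_eq, b_eq, bounds) with rows A_ub @ v <= b_ub
    and A_eq @ v == b_eq.
    """
    m, n, q = len(lam), len(mu), len(lam[0])
    # Outputs outside J' are switched off through a zero upper bound.
    cost = list(c[:m]) + [c[m + j] if j in J_sel else 0.0 for j in range(n)]
    bounds = [(0.0, ub_x[i]) for i in range(m)]
    bounds += [(0.0, ub_y[j] if j in J_sel else 0.0) for j in range(n)]
    # Flow conservation at the pool: sum x - sum y = 0.
    A_eq, b_eq = [[1.0] * m + [-1.0] * n], [0.0]
    # Pool capacity: sum y <= cap_pool.
    A_ub, b_ub = [[0.0] * m + [1.0] * n], [cap_pool]
    # One linear quality row per selected output and quality.
    for j in sorted(J_sel):
        for k in range(q):
            A_ub.append([lam[i][k] - mu[j][k] for i in range(m)] + [0.0] * n)
            b_ub.append(0.0)
    return cost, A_ub, b_ub, A_eq, b_eq, bounds


# Hypothetical instance: two inputs, two outputs, one quality; J' = {0}.
cost, A_ub, b_ub, A_eq, b_eq, bounds = build_lp(
    {0}, [6.0, 16.0, -9.0, -15.0], [[3.0], [1.0]], [[2.5], [1.5]],
    [100.0, 100.0], [100.0, 200.0], 300.0)
```

The returned lists are in the usual "minimize cost subject to inequality rows, equality rows and variable bounds" layout, so they could be handed to any LP solver. Once $J'$ is fixed, the bilinear terms disappear because the quality constraints only involve the input flows.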
Introducing variables $z_i = x_i / (x_1 + \dots + x_m)$ ($i \in [m-1]$), condition (9) can be rewritten as

$$\forall k \in [q] \qquad \sum_{i=1}^{m-1} (\lambda_{ik} - \lambda_{mk}) z_i \le \mu_{jk} - \lambda_{mk}. \tag{10}$$

The vector $z$ is an element of the simplex $\Delta^{m-1} = \{z \in [0,1]^{m-1} : z_1 + \dots + z_{m-1} \le 1\}$. For $z \in \Delta^{m-1}$
we define the reachable output set $J(z)$ as

$$J(z) = \{j \in [n] : \text{(10) is satisfied}\}. \tag{11}$$
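The reachable output set can be computed directly from the proportions $z$. The sketch below uses the equivalent blended form of (10), namely $\sum_{i=1}^{m} \lambda_{ik} z_i \le \mu_{jk}$ with $z_m = 1 - z_1 - \dots - z_{m-1}$. The function name `reachable_outputs`, the 0-based output indices and the data are assumptions made for illustration.

```python
def reachable_outputs(z, lam, mu, tol=1e-12):
    """Return J(z) as a set of 0-based output indices.

    z         proportions of the first m-1 inputs (the last one is implied)
    lam[i][k] quality value of input i for quality k
    mu[j][k]  upper bound on quality k at output j
    """
    m, q = len(lam), len(lam[0])
    z_full = list(z) + [1.0 - sum(z)]
    # Quality of the pool blend for each quality k.
    pool = [sum(lam[i][k] * z_full[i] for i in range(m)) for k in range(q)]
    return {j for j in range(len(mu))
            if all(pool[k] <= mu[j][k] + tol for k in range(q))}


# Hypothetical instance: two inputs, two outputs, one quality.
lam = [[3.0], [1.0]]
mu = [[2.5], [1.5]]
J_quarter = reachable_outputs([0.25], lam, mu)  # pool quality 1.5
J_half = reachable_outputs([0.5], lam, mu)      # pool quality 2.0
J_one = reachable_outputs([1.0], lam, mu)       # pool quality 3.0
```

In this example the set shrinks as the share of the higher-valued input grows, which is the behaviour the partition of the simplex is meant to capture.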


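Lemma 1. The objective value for any flow corresponding to $z \in \Delta^{m-1}$ is at least val($J(z)$).

Proof. For a fixed $z \in \Delta^{m-1}$, we can find the optimal flow by solving the linear program

$$
\begin{aligned}
\min_{x,y} \quad & \sum_{i=1}^{m} c_i x_i + \sum_{j \in J(z)} c_{m+j} y_j \\
\text{s.t.} \quad & (x,y) \in \mathcal{F}, \\
& \sum_{i=1}^{m} x_i = \sum_{j \in J(z)} y_j, \\
& x_i = z_i (x_1 + \dots + x_m), \qquad i \in [m],
\end{aligned}
$$

where $z_m = 1 - z_1 - \dots - z_{m-1}$. Every feasible solution for this problem is also feasible for LP($J(z)$) and the claim follows. □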


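The inequalities (10) define a partition of $\mathbb{R}^{m-1}$ (and therefore of $\Delta^{m-1}$) into regions of constant $J(z)$.
To be more precise, let $\mathcal{H}$ be the hyperplane arrangement $\mathcal{H} = \{H_{jk} : j \in [n],\ k \in [q]\}$, where

$$H_{jk} = \left\{ z \in \mathbb{R}^{m-1} : \sum_{i=1}^{m-1} (\lambda_{ik} - \lambda_{mk}) z_i = \mu_{jk} - \lambda_{mk} \right\}.$$

The system $\mathcal{H}$ induces a partition of $\mathbb{R}^{m-1}$. Let $H_{jk}^{0}$ and $H_{jk}^{1}$ be defined by

$$
\begin{aligned}
H_{jk}^{0} &= \left\{ z \in \mathbb{R}^{m-1} : \sum_{i=1}^{m-1} (\lambda_{ik} - \lambda_{mk}) z_i \le \mu_{jk} - \lambda_{mk} \right\}, \\
H_{jk}^{1} &= \left\{ z \in \mathbb{R}^{m-1} : \sum_{i=1}^{m-1} (\lambda_{ik} - \lambda_{mk}) z_i > \mu_{jk} - \lambda_{mk} \right\}.
\end{aligned}
$$

If, for every vector $\varepsilon = (\varepsilon_{jk})_{j \in [n],\, k \in [q]} \in \{0,1\}^{nq}$, we define the set

$$P(\varepsilon) = \bigcap_{j=1}^{n} \bigcap_{k=1}^{q} H_{jk}^{\varepsilon_{jk}},$$

then the space $\mathbb{R}^{m-1}$ is the disjoint union of the sets $P(\varepsilon)$, and for every $z \in \Delta^{m-1}$ the set $J(z)$ is
determined by the vector $\varepsilon$ with $z \in P(\varepsilon)$.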

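Lemma 2. For $\varepsilon \in \{0,1\}^{nq}$, let $J(\varepsilon) = \{j \in [n] : \forall k \in [q]\ \varepsilon_{jk} = 0\}$. Then, for all $\varepsilon \in \{0,1\}^{nq}$ and for
all $z \in P(\varepsilon) \cap \Delta^{m-1}$, we have $J(z) = J(\varepsilon)$.

Proof. Let $\varepsilon \in \{0,1\}^{nq}$ and $z \in P(\varepsilon) \cap \Delta^{m-1}$. Then

$$j \in J(z) \iff \forall k \in [q]\ \ z \in H_{jk}^{0} \overset{z \in P(\varepsilon)}{\iff} \forall k \in [q]\ \ \varepsilon_{jk} = 0 \iff j \in J(\varepsilon).$$

□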
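Lemma 2 can be checked numerically on small instances. The following sketch computes, for a given $z$, the 0/1 vector recording on which side of each hyperplane $z$ lies, and then reads off the output set from that vector. The function names `sign_vector` and `outputs_from_signs`, the 0-based indices and the data are hypothetical.

```python
def sign_vector(z, lam, mu):
    """Return eps[(j, k)] in {0, 1}: 0 if z satisfies inequality (j, k), else 1."""
    m, q = len(lam), len(lam[0])
    eps = {}
    for j in range(len(mu)):
        for k in range(q):
            lhs = sum((lam[i][k] - lam[m - 1][k]) * z[i] for i in range(m - 1))
            eps[(j, k)] = 0 if lhs <= mu[j][k] - lam[m - 1][k] else 1
    return eps


def outputs_from_signs(eps, n, q):
    """Outputs j whose entries eps[(j, k)] are zero for every quality k."""
    return {j for j in range(n) if all(eps[(j, k)] == 0 for k in range(q))}


# Hypothetical instance: two inputs, two outputs, one quality.
lam = [[3.0], [1.0]]
mu = [[2.5], [1.5]]
eps_half = sign_vector([0.5], lam, mu)
J_eps = outputs_from_signs(eps_half, n=2, q=1)
```

Here the point $z = 0.5$ satisfies the inequality of the first output and violates that of the second, so only the first output is reachable.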


It is well known that the number of nonempty sets $P(\varepsilon)$ is bounded by a polynomial of degree $m$ in
$nq$ (see for example [5]). However, direct application of [5] yields the upper bound $\sum_{i=0}^{m-1} \binom{nq}{i}$, which
is weaker than the bound in the following lemma. We derive a stronger bound than [8] since the $nq$
hyperplanes are partitioned into $q$ subsets of $n$ parallel hyperplanes each.

Lemma 3. There are at most $\sum_{i=0}^{m-1} \binom{q}{i} n^i$ vectors $\varepsilon \in \{0,1\}^{nq}$ such that $P(\varepsilon) \neq \emptyset$.

Proof. We denote the claim of the lemma, parameterized by the input cardinality $m$ and the quality
cardinality $q$, by $C(m,q)$, and we prove this claim by induction on $m$ and $q$. Base case and inductive
step are as follows:

1. Base case: $\forall q, m \in \{1, 2, \dots\}$: $C(1,q)$, $C(2,q)$ and $C(m,1)$

2. Inductive step: $\forall q \in \{2, 3, \dots\}$, $\forall m \in \{3, 4, \dots\}$: $C(m-1, q-1) \wedge C(m, q-1) \implies C(m,q)$

For $m = 1$, note that $\mathbb{R}^{0} = \{0\}$ contains only a single point, and since the sets $P(\varepsilon)$ are disjoint there can
be at most $1 = \binom{q}{0} n^0$ nonempty sets $P(\varepsilon)$. In fact, using assumption (8), we have $P(\varepsilon) \neq \emptyset \iff \varepsilon = 0$.
For $m = 2$, the $nq$ inequalities partition $\mathbb{R}^{1}$ into at most $1 + nq = \binom{q}{0} n^0 + \binom{q}{1} n^1$ intervals. For $q = 1$ and
$m \ge 3$, the $n$ parallel hyperplanes $H_{11}, \dots, H_{n1}$ partition $\mathbb{R}^{m-1}$ into at most $1 + n = \binom{1}{0} n^0 + \binom{1}{1} n^1$ parts.

Now let $q \ge 2$, $m \ge 3$, and assume that $C(m-1, q-1)$ and $C(m, q-1)$ are true. From $C(m, q-1)$ it
follows that the system $\{H_{jk} : j \in [n],\ k \in [q-1]\}$ cuts $\mathbb{R}^{m-1}$ into at most

$$\sum_{i=0}^{m-1} \binom{q-1}{i} n^i$$

parts. For every $j \in [n]$, the hyperplane $H_{jq}$ is isomorphic to $\mathbb{R}^{m-2}$, and for every $j' \in [n]$, $k \in [q-1]$,
the intersection $H_{j'k} \cap H_{jq}$ is either empty or an $(m-3)$-dimensional affine subspace of $H_{jq}$. Since the
map $H_{j'k} \mapsto H_{j'k} \cap H_{jq}$ preserves parallelism, $C(m-1, q-1)$ implies that the hyperplane $H_{jq}$ is cut by
the system $\{H_{j'k} \cap H_{jq} : j' \in [n],\ k \in [q-1]\}$ into at most

$$\sum_{i=0}^{m-2} \binom{q-1}{i} n^i$$

parts. If we start with the partition of $\mathbb{R}^{m-1}$ given by the system $\{H_{jk} : j \in [n],\ k \in [q-1]\}$ and add
the hyperplanes $H_{1q}, H_{2q}, \dots, H_{nq}$ one by one, then every hyperplane adds at most $\sum_{i=0}^{m-2} \binom{q-1}{i} n^i$ parts
to the partition, and the number of parts into which $\mathbb{R}^{m-1}$ is cut by $\mathcal{H}$ is at most

$$
\begin{aligned}
\sum_{i=0}^{m-1} \binom{q-1}{i} n^i + n \sum_{i=0}^{m-2} \binom{q-1}{i} n^i
&= \binom{q-1}{0} n^0 + \sum_{i=1}^{m-1} \left[ \binom{q-1}{i} + \binom{q-1}{i-1} \right] n^i \\
&= \sum_{i=0}^{m-1} \binom{q}{i} n^i.
\end{aligned}
$$

□
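A short numerical comparison may help to see how much is gained by exploiting parallelism. The sketch below evaluates the bound $\sum_{i=0}^{m-1} \binom{q}{i} n^i$ of Lemma 3 and the generic bound $\sum_{i=0}^{m-1} \binom{nq}{i}$ for $nq$ hyperplanes in $\mathbb{R}^{m-1}$, and checks the recurrence used in the inductive step. The function names and the chosen parameter values are hypothetical.

```python
from math import comb


def lemma3_bound(m, n, q):
    """Sum over i = 0..m-1 of C(q, i) * n**i."""
    return sum(comb(q, i) * n**i for i in range(m))


def generic_bound(m, n, q):
    """Sum over i = 0..m-1 of C(n*q, i), the bound for arbitrary hyperplanes."""
    return sum(comb(n * q, i) for i in range(m))


# Hypothetical sizes: 3 inputs, 10 outputs, 2 qualities.
structured = lemma3_bound(3, 10, 2)   # 1 + 2*10 + 1*100
generic = generic_bound(3, 10, 2)     # 1 + 20 + 190
all_subsets = 2 ** 10                 # naive enumeration of all J'

# Recurrence from the inductive step, checked for one parameter choice.
recurrence_holds = (
    lemma3_bound(4, 3, 5) == lemma3_bound(4, 3, 4) + 3 * lemma3_bound(3, 3, 4)
)
```

For these sizes the structured bound is 121, compared with 211 for the generic bound and 1024 subsets in the naive enumeration.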


Remark 1. Note that the proof of Lemma 3 also provides a recursive method to determine the vectors
$\varepsilon$ with $P(\varepsilon) \neq \emptyset$ in polynomial time.


Remark 2. The upper bound given in Lemma 3 is best possible, i.e., for all $m$, $q$ and $n$, there exist
instances in which the number of vectors $\varepsilon$ with $P(\varepsilon) \neq \emptyset$ equals $\sum_{i=0}^{m-1} \binom{q}{i} n^i$. This bound is
attained by almost all systems $\mathcal{H}$. To make this statement more precise, we say that a system $\mathcal{H}$ of
$nq$ hyperplanes $H_{jk}$ in $\mathbb{R}^{m-1}$, consisting of $q$ sets of $n$ parallel hyperplanes, is in general position if the
intersection of every set of $m$ of these hyperplanes is empty and

$$
\begin{aligned}
&\forall t \in [m-1]\ \ \forall (j_1, \dots, j_t) \in [n]^t\ \ \forall (k_1, \dots, k_t) \in [q]^t \text{ with } k_1 < k_2 < \dots < k_t: \\
&H_{j_1 k_1} \cap H_{j_2 k_2} \cap \dots \cap H_{j_t k_t} \text{ is an } (m-1-t)\text{-dimensional affine subspace of } \mathbb{R}^{m-1}.
\end{aligned}
$$

The bound in Lemma 3 is attained whenever the system $\mathcal{H}$ is in general position, and this can be seen
by checking that in this case all estimates in the induction proof are tight. For $m = 1$, we have that
$P(0) = \{0\} \neq \emptyset$. For $m = 2$, the system is a list of $nq$ points, and $\mathcal{H}$ is in general position if these
points are distinct, in which case it partitions $\mathbb{R}^{1}$ into $1 + nq$ parts as required. For $q = 1$, the $n$
parallel hyperplanes $H_{11}, \dots, H_{n1}$ in general position partition $\mathbb{R}^{m-1}$ into exactly $1 + n$ parts. For the
inductive step, note that the system of intersections $\{H_{j'k} \cap H_{jq} : j' \in [n],\ k \in [q-1]\}$ forms a system
of hyperplanes in general position in $H_{jq}$, and therefore the inequalities in the inductive step are satisfied
with equality.
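For $m = 2$ the arrangement consists of points on the real line, so the candidate output sets on the simplex $[0,1]$ can be enumerated by evaluating $J(z)$ at every breakpoint and at one point between each pair of consecutive breakpoints. The sketch below does this with exact rational arithmetic. It is restricted to $m = 2$, uses 0-based indices, and the function name `candidate_output_sets` and the data are hypothetical; it is not the recursive method of Remark 1.

```python
from fractions import Fraction


def candidate_output_sets(lam, mu):
    """Enumerate the distinct sets J(z), z in [0, 1], for m = 2 inputs.

    lam[i][k] and mu[j][k] must be integers or Fractions.
    """
    q, n = len(lam[0]), len(mu)

    def J(z):
        return frozenset(
            j for j in range(n)
            if all((lam[0][k] - lam[1][k]) * z <= mu[j][k] - lam[1][k]
                   for k in range(q)))

    # Breakpoints: the points H_jk that fall strictly inside (0, 1).
    points = {Fraction(0), Fraction(1)}
    for j in range(n):
        for k in range(q):
            d = lam[0][k] - lam[1][k]
            if d != 0:
                t = Fraction(mu[j][k] - lam[1][k], d)
                if 0 < t < 1:
                    points.add(t)
    pts = sorted(points)
    # J(z) is constant between consecutive breakpoints, so one sample
    # per breakpoint and per open interval is enough.
    samples = pts + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    return {J(z) for z in samples}


# Hypothetical instance: two inputs, two outputs, one quality.
lam = [[3], [1]]
mu = [[Fraction(5, 2)], [Fraction(3, 2)]]
sets_found = candidate_output_sets(lam, mu)
```

In this instance the two breakpoints $1/4$ and $3/4$ are distinct, and three different output sets are found, which matches the value $1 + nq = 3$ for $n = 2$, $q = 1$.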

Theorem 1. For every positive integer $m$, the pooling problem with one pool and $m$ inputs can be solved
in polynomial time. More precisely, it can be reduced to solving at most

$$\sum_{i=0}^{m-1} \binom{q}{i} n^i$$

linear programs with $m + n$ variables and $m + n(q + 1) + 2$ constraints, where $q$ is the number of qualities
and $n$ is the number of outputs.

Proof. We claim that the pooling problem can be solved by choosing a minimum cost solution obtained
from solving the problem LP($J(\varepsilon)$) for every $\varepsilon$ with $P(\varepsilon) \cap \Delta^{m-1} \neq \emptyset$, and by Lemma 3 the number of
these linear programs is bounded as claimed. Clearly, $B = \min\{\mathrm{val}(J(\varepsilon)) : P(\varepsilon) \cap \Delta^{m-1} \neq \emptyset\}$ is an
upper bound because a solution for LP($J(\varepsilon)$) is always feasible for the pooling problem. By Lemma 2,
for every $z \in \Delta^{m-1}$ there exists some $\varepsilon$ with $J(z) = J(\varepsilon)$, and using Lemma 1 it follows that $B$ is also
a lower bound. □

We note that this result was obtained, independently, by Haugland and Hendrix.


4 Remaining open problems 

To further characterize the complexity of the pooling problem, the following open problems could be
addressed in the future [12, 13]:

1. For all the cases that can be solved in polynomial time by reduction to polynomially many linear 
programs of polynomial size, does there exist a strongly polynomial algorithm, i.e., an algorithm 
that is polynomial in the number of vertices and qualities? 

2. Is the pooling problem with one quality and in-degrees at most two polynomially solvable? 

3. Is the pooling problem with one quality and out-degrees at most two polynomially solvable? 

4. Do polynomial algorithms exist for the pooling problem with two pools and some bounds on the 
number of inputs, outputs, and qualities? 